I.  OVERVIEW




The  first  quarter  of  the  OCCAM  contract  effort  was  especially  fruitful. 
Several  pure  results  were  obtained  in  optical  conceptual  computing  and 
associative  memories.  These  results  include  the  formal  definitions  of  and 
theorems  on  bidirectional  associative  memories  and  fuzzy  associative 
memories, the first-principles proof that differential Hebbian learning
subsumes  standard  Hebbian  learning,  the  construction  of  a  new  optically 
computable  fuzzy  integral,  a  quantitative  theory  of  fuzzy  cognitive  map 
combination  and  inferencing,  the  design  and  preliminary  simulation  of  a 
novel  all-optical  dynamical  associative  memory,  the  first  optical  design  for 
implementing  the  fundamental  fuzzy  set/logic  operations  of  pairwise  minimum 
and  maximum,  the  design  and  preliminary  simulation  of  a  translation, 
rotation,  and  scale  invariant  optical  preprocessor  suitable  for  pattern 
recognition  by  associative  memory,  and  the  design  and  construction  of  an 
associative  memory  demonstration  computer  board.  Several  of  these  results 
are  currently  in  preparation  as  technical  papers.  Some  have  been  presented 
at professional speaking engagements.


The VERAC and UCSD research teams worked closely together on
essentially a daily basis during the 1st Quarter. Many insights and
improvements were jointly obtained. Nevertheless, there was sufficient
division of research labor to warrant a discussion of 1st Quarter OCCAM
activities in separate VERAC and UCSD sections, bearing in mind, again, the
close interaction of the researchers.

Bart Kosko directs the VERAC OCCAM effort. Robert Sasseen provides
simulation and analysis support. Robert and all the OCCAM research team
have successfully completed Bart's Fuzzy Theory course at UCSD. The key
research results and activities of VERAC's OCCAM effort are listed below.


1.  Bidirectional  Associative  Memory.  Bart  defined  bidirectional  stability 
and  proved  that  every  matrix  is  a  bidirectional  associative  memory  (BAM). 
With  Clark  Guest  he  devised  a  basic  phase-conjugate  resonator  implementation 
of  a  BAM.  The  paper  "Bidirectional  Associative  Memories"  is  in  preparation. 


Let $M$ be an arbitrary n-by-p real matrix. Let $A$ be an n-dimensional
binary vector and $B$ be a p-dimensional vector. Let $M(\cdot)$ and
$M^T(\cdot)$ be nonlinear operators that depend on $M$ and $M^T$, where $M^T$
is the matrix transpose of $M$. Suppose $B = M(A)$, $A' = M^T(B)$,
$B' = M(A')$, $A'' = M^T(B')$, and so on. Then $M$ is bidirectionally stable
if for every initial pair $(A, B)$ there exists a fixed pair $(A_f, B_f)$
such that

$$B = M(A),\; A' = M^T(B),\; \ldots,\; B_f = M(A_f),\; A_f = M^T(B_f),\; B_f = M(A_f) .$$

Hence if $M$ is bidirectionally stable, $M$ behaves as a heteroassociative
content addressable memory (CAM). We know of no other heteroassociative CAM
in the literature.


Which matrices are bidirectionally stable? That depends on how the
nonlinear operators $M(\cdot)$ and $M^T(\cdot)$ are specified. Surely the
simplest operators are threshold linear:

$$a_i = \begin{cases} 1 & \text{if } B M_i^T > 0 \\ 0 & \text{if } B M_i^T < 0 \end{cases}
\qquad
b_j = \begin{cases} 1 & \text{if } A M^j > 0 \\ 0 & \text{if } A M^j < 0 \end{cases}$$

where $M_i$ is the ith row (column) of $M$ ($M^T$) and $M^j$ is the jth
column of $M$. I.e., vector multiply $M$ by $A$ then hardclip to produce a
binary $B$, vector multiply $M^T$ by $B$ then hardclip to produce a binary
$A'$, and so on. This process can be interpreted as the synchronous
interaction of two Grossberg fields of McCulloch-Pitts neurons
$F_A = (a_1, \ldots, a_n)$ and $F_B = (b_1, \ldots, b_p)$ symmetrically
interconnected via the synaptic weights $(m_{ij})$. Asynchronous neuronal
state changes are also permitted.


If $M(\cdot)$ and $M^T(\cdot)$ are interpreted as threshold-linear adjoint
operators, then our question has been answered with a decisive theorem:
Every matrix is bidirectionally stable. The theorem is proven by
identifying a Lyapunov or energy function with the operation of the
bidirectional threshold-linear operator. The correct energy potential turns
out to be

$$E(A, B) = -\tfrac{1}{2}\, A M B^T - \tfrac{1}{2}\, B M^T A^T .$$

Observe that $B M^T A^T = B (A M)^T = (A M B^T)^T = A M B^T$, where the last
equality follows since the transpose of a scalar equals the scalar. Hence
$E$ is equivalent to

$$E(A, B) = -A M B^T .$$

We can then show that the energy change $E_2 - E_1$ due to a state change in
neuron $a_i$, or the entire neuron vector $A$, is negative. Since $E$ is
bounded below by the negative of the sum of the absolute values of the
entries of $M$, $E$ converges to a local energy minimum. Since $M$ was
arbitrary, the theorem follows for both synchronous and asynchronous state
changes. Hence every matrix $M$ can be decoded as an associative memory.
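
The recall procedure and its energy function can be sketched in a few lines
of Python. This is a minimal illustration, not the report's simulation code:
the function names, the example matrix, and the rule that a neuron keeps its
previous state when its input sum is exactly zero are assumptions made here.

```python
def matvec(a, m):
    """Row vector a (length n) times an n-by-p matrix m -> list of length p."""
    return [sum(a[i] * m[i][j] for i in range(len(a))) for j in range(len(m[0]))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def threshold(sums, previous):
    """Hard-clip to binary; a zero sum keeps the old state (assumption)."""
    return [1 if s > 0 else 0 if s < 0 else old for s, old in zip(sums, previous)]

def energy(a, m, b):
    """E(A, B) = -A M B^T."""
    return -sum(x * y for x, y in zip(matvec(a, m), b))

def bam_recall(a, b, m, max_passes=100):
    """Alternate B = M(A), A = M^T(B) until the pair (A, B) stops changing."""
    mt = transpose(m)
    for _ in range(max_passes):
        new_b = threshold(matvec(a, m), b)
        new_a = threshold(matvec(new_b, mt), a)
        if new_a == a and new_b == b:
            return a, b
        a, b = new_a, new_b
    raise RuntimeError("no fixed pair found within max_passes")

m = [[2, -1, 0], [-3, 1, 4]]            # arbitrary 2-by-3 real matrix
a_f, b_f = bam_recall([1, 0], [0, 0, 0], m)
e_start = energy([1, 0], m, [0, 0, 0])
e_final = energy(a_f, m, b_f)
```

Comparing `e_start` with `e_final` gives a quick check that the energy did
not increase along the recall trajectory for this particular matrix.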

This result subsumes the result of Hopfield et al that a square
symmetric zero-diagonal matrix is unidirectionally stable (an
autoassociative CAM). The Hopfield case follows if $M$ is square symmetric
with zero (or more generally nonnegative) diagonal elements and $A = B$.
Moreover, in general the Hopfield associative memory is stable only for
asynchronous (serial) recall, a serious restriction that does not apply to a
BAM. For instance, one of the simplest Hopfield associative memories stores
the vector (1 0) as

$$M = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} .$$

Now multiply this $M$ by the bipolar vector $X = (1\ 1)$. This gives
$X^c = (-1\ -1)$. But $X^c M = (1\ 1)$. Hence the iterative recall procedure
forever oscillates or blinks back and forth between $X$ and $X^c$. In other
words, in synchronous operation (vector multiplication), $M$ is
unidirectionally unstable but bidirectionally stable!
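
The oscillation is easy to reproduce. The sketch below assumes that M is the
zero-diagonal outer product of the bipolar form (1 -1) of the stored vector
(1 0), and that recall uses a bipolar sign threshold; the helper name
`sync_step` and the tie rule (a zero sum maps to -1) are illustrative
choices, and the tie rule is never exercised here.

```python
def sync_step(x, m):
    """One synchronous bipolar update: x <- sign(x M)."""
    sums = [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]
    return [1 if s > 0 else -1 for s in sums]

m = [[0, -1], [-1, 0]]        # stores (1 0) as the bipolar vector (1 -1)
trajectory = [[1, 1]]
for _ in range(4):
    trajectory.append(sync_step(trajectory[-1], m))

# Read bidirectionally, X maps to X^c and X^c maps back to X (M is symmetric,
# so M^T = M), which makes (X, X^c) a fixed pair.
x = [1, 1]
x_c = sync_step(x, m)
pair_is_fixed = sync_step(x_c, m) == x
```

The list `trajectory` alternates between the two vectors, which is the
unidirectional blinking, while `pair_is_fixed` records that the same two
vectors form a stable pair when read as a BAM.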

The BAM storage algorithm allows m-many heteroassociative pairs
$(A_i, B_i)$ to be encoded in $M$ by suitably sculpting the energy surface
defined by $E$. Transform the binary pair $(A_i, B_i)$ into the bipolar pair
$(X_i, Y_i)$. Memorize the vector pair by forming the correlation matrix
$X_i^T Y_i$. (Since $X_i^c = -X_i$, this encoding technique also memorizes
the complement association $(X_i^c, Y_i^c)$ since
$(X_i^c)^T Y_i^c = X_i^T Y_i$.) Form $M$ now by superimposition: simply
pointwise add the m correlation matrices $X_i^T Y_i$.
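
As a rough sketch of this encoding step, the following Python builds M from
binary pairs by summing bipolar outer products and then confirms that the
complement pairs produce the same matrix. The function names and the two
example pairs are hypothetical.

```python
def to_bipolar(v):
    """Map binary {0, 1} entries to bipolar {-1, +1}."""
    return [2 * bit - 1 for bit in v]

def outer(x, y):
    """Correlation matrix X^T Y for row vectors x and y."""
    return [[xi * yj for yj in y] for xi in x]

def encode_bam(pairs):
    """M = sum over i of X_i^T Y_i for the bipolar versions of (A_i, B_i)."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    m = [[0] * p for _ in range(n)]
    for a, b in pairs:
        corr = outer(to_bipolar(a), to_bipolar(b))
        m = [[m[i][j] + corr[i][j] for j in range(p)] for i in range(n)]
    return m

pairs = [([1, 0, 1, 0], [1, 1, 0]), ([1, 1, 0, 0], [1, 0, 1])]
m = encode_bam(pairs)

# Complementing every bit flips the sign of both X_i and Y_i, so each outer
# product, and therefore M, is unchanged.
complements = [([1 - x for x in a], [1 - y for y in b]) for a, b in pairs]
m_complement = encode_bam(complements)
```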

We  observe,  as  most  have  overlooked  when  using  correlation  memorization 
techniques,  that  the  BAM  encoding  algorithm  employs  Grossberg  reciprocal 
outstar  coding.  Indeed  the  BAM  operation  corresponds  to  a  simple  form  of 
Grossberg adaptive resonance between fields $F_A$ and $F_B$.

It  follows  from  BAM  analysis  that  the  storage  capacity  of  M  is  given  by 
a  generalization  of  the  familiar  bound  for  autoassociators : 

m  <  min(n,  p) 

for  reliable  coding. 

Finally, we comment that in the OCCAM 2nd Quarter research effort a
continuous/differentiable version of the BAM theorem has been proved. The
difference-equation version of this theorem allows fuzzy unit, or fit,
vectors (with element values in [0, 1]) to be used in the BAM recall and
storage procedure. With Clark Guest a preliminary phase-conjugate resonator
BAM implementation has been constructed.

2. Fuzzy Associative Memory. Fuzzy associative memory (FAM) is a term
occasionally used (for instance in TRW's so-called FAM (WAM) VHSIC
processor) but seldom defined. Bart defined a FAM as a fuzzy relation $M$
that maps input fuzzy sets $A$ to output fuzzy sets $B$, and proved that
every fuzzy heteroassociative pair $(A, B)$ can be stored and recalled with
perfect reliability in an easily constructed relation $M$. $M$ is realized
by an n-by-p matrix of elements in [0, 1], i.e., a point in the space
$[0, 1]^{n \times p}$. $A$ is an n-fit vector and $B$ is a p-fit vector,
i.e., $A$ and $B$ are respectively points in the unit hypercubes $[0, 1]^n$
and $[0, 1]^p$. These results are included in "Fuzzy Associative Memories,"
in preparation, invited to appear in a special Addison-Wesley edition on
fuzzy expert systems.

The key insight is that association generalizes the familiar logical
operation of modus ponens: if $A$ and $A \to B$, then $B$. The (fuzzy)
logical conditional $A \to B$ stores the pair $(A, B)$. When a key $C$ is
applied to the memory, $B$ is recalled if $C = A$. More generally if $C$ is
approximately $A$, then $B'$ is recalled where $B'$ is approximately $B$.
Storing the pair $(A, B)$ corresponds to heteroassociative memory. As a
special case, storing the redundant pair $(A, A)$ corresponds to
autoassociative memory.

The fundamental fuzzy operation of set-relation composition (analogous
to vector-matrix multiplication) is max-min composition. $A \circ M$ denotes
max-min composition. The operation $A \circ M$ is performed by intersecting
the fuzzy set $A$ with the fuzzy sets of $M$ represented as columns. This is
directly analogous to the vector operation $A M$ where the vector $A$
multiplies the columns of $M$. In particular if $A \circ M = B$, the jth
element of fuzzy set $B$ is found by taking the maximum of the pairwise
minima of $A$'s fit values with the fit values of the jth column of $M$:

$$b_j = \max\{\min(a_1, m_{1j}), \ldots, \min(a_n, m_{nj})\} .$$

This is directly analogous to the vector-multiply operation of taking the
global sum of pairwise products. One difference, however, is that min and
max do not disturb the data on which they act. They only affect order. Hence
if $B = A \circ M$, then every element of $B$ is some element of $A$ or $M$.
We note that $M$ is in fact the conditional possibility distribution of $B$
given $A$.
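
Max-min composition is short enough to write out directly. The sketch below
uses a hypothetical function name `max_min` and made-up fit values; it also
checks the remark that every output value is copied from A or M rather than
computed arithmetically.

```python
def max_min(a, m):
    """Max-min composition A o M: b_j = max over i of min(a_i, m_ij)."""
    return [max(min(a[i], m[i][j]) for i in range(len(a)))
            for j in range(len(m[0]))]

a = [0.2, 0.8]
m = [[0.5, 0.1],
     [0.6, 0.9]]
b = max_min(a, m)

# min and max only select values, so each b_j already occurs in A or in M.
pool = set(a) | {value for row in m for value in row}
selected_only = all(bj in pool for bj in b)
```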

We now briefly state our results. Suppose we wish to memorize the
fuzzy set $A = (.3\ 1\ .4\ .7)$. Hence we wish to store the autoassociative
pair $(A, A)$ in $M$. The Compositional Rule of Inference, propounded by
Zadeh, says that we form the relation (conditional possibility distribution)
$M$ by identifying $m_{ij}$ with the pointwise fuzzy logical implication or
truth value $t_{ij}$. With this we agree. However, Zadeh et al suggest the
Lukasiewicz implication value $t_{ij} = \min(1, 1 - a_i + b_j)$ (where in
the autoassociative case $b_j = a_j$). This gives the fuzzy relation $M$:

$$M = \begin{pmatrix} 1 & 1 & 1 & 1 \\ .3 & 1 & .4 & .7 \\ .9 & 1 & 1 & 1 \\ .6 & 1 & .7 & 1 \end{pmatrix}$$

and hence in fact $A \circ M = A$. However, this is only true because this
technique always produces 1s along the main diagonal. Moreover, $t_{ij} = 1$
if (and only if) $a_i \le b_j$, which tends to occur at least as often as
$a_i > b_j$ occurs. Hence $M$ tends to consist of 1s. This precludes
heteroassociative recall reliability, tending to produce
$B = (1, 1, \ldots, 1)$ for any input $A$.

We have proven that the correct implication operator is
$t_{ij} = \min(a_i, b_j)$, which is essentially a fuzzy Hebb law. This is
equivalent to representing $M$ as the fuzzy cartesian product
$M = A \times B = A^T \circ B$. This produces the memory $M$:

$$M = \begin{pmatrix} .3 & .3 & .3 & .3 \\ .3 & 1 & .4 & .7 \\ .3 & .4 & .4 & .4 \\ .3 & .7 & .4 & .7 \end{pmatrix}$$

and hence again $A \circ M = A$. There are two key properties at work in
this selection of $M$. First, $A = \mathrm{Diagonal}(M)$ since
$t_{ii} = \min(a_i, a_i) = a_i$. Second, the diagonal entries always
dominate the column entries: $m_{ii} \ge m_{ji}$ for all j, for each i,
which again follows from the nature of the minimum operation. Our
autoassociative theorem states that $A$ can be perfectly memorized by an $M$
such that $A = \mathrm{Diagonal}(M)$ and $M$ is diagonal dominant. Hence
$A^T \circ A$ works, as well as the simpler choice of $M$ that lists $A$
down the main diagonal and puts 0s elsewhere. Another theorem says that for
all other $A'$, $A' \circ M \subset A$, i.e., the elements of $A' \circ M$
are pairwise dominated by the elements of $A$. Hence $M$ is a subset
classifier as opposed to the more specialized (and more popular) metric
classifiers. The further $A'$ is from $A$, the more $A' \circ M$ approaches
the empty set $(0, 0, \ldots, 0)$.
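
The autoassociative claims can be checked numerically for the example fit
vector A = (.3 1 .4 .7). The sketch below is illustrative only: the function
names and the probe vector are assumptions, and a single probe demonstrates
the subset property rather than proving it.

```python
def compose(a, m):
    """Max-min composition A o M."""
    return [max(min(a[i], m[i][j]) for i in range(len(a)))
            for j in range(len(m[0]))]

def fuzzy_hebb(a, b):
    """Fuzzy Hebb encoding: m_ij = min(a_i, b_j)."""
    return [[min(ai, bj) for bj in b] for ai in a]

a = [0.3, 1.0, 0.4, 0.7]
m = fuzzy_hebb(a, a)
recalled = compose(a, m)                      # recall with the stored key

# The simpler memory: A on the main diagonal, zeros elsewhere.
diag = [[a[i] if i == j else 0.0 for j in range(len(a))] for i in range(len(a))]
recalled_diag = compose(a, diag)

probe = [1.0, 0.2, 0.9, 0.1]                  # some other fit vector A'
probe_out = compose(probe, m)
is_subset = all(x <= y for x, y in zip(probe_out, a))
```

Both `recalled` and `recalled_diag` can be compared against `a`, and
`is_subset` reports whether the probe's output is dominated elementwise by A.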

Our heteroassociative theorem says that $m_{ij} = \min(a_i, b_j)$
perfectly memorizes $(A, B)$ subject to one condition: for every element
$b_j$ of $B$, there exists some element $a_i$ of $A$ such that
$b_j \le a_i$. Note that if $A$ and $B$ are binary, this condition is always
satisfied since only $A = (0, 0, \ldots, 0)$ could violate it, but then
$A^T \circ B$ produces the zero matrix! Moreover, $M$ is a subset classifier
since for all $A'$, $A' \circ M \subset B$. We comment that $M^T$ also
produces the bidirectional memory relation subject to the same dominance
condition. We also comment that the recent AT&T Bell Labs fuzzy logic VLSI
chip implements our fuzzy associative memory without realizing it! Through
personal communication with its developer, Masaki Togai (now at Rockwell),
we learned that the chip designers selected the min operation after
exhaustive simulations because only it worked as an implication operator.

Suppose we wish to store the pair $(A, B)$ where $A = (.3\ 1\ .4\ .7)$ as
before and $B = (.5\ .2)$. The key condition of the theorem is satisfied
since a 1 occurs in $A$. Hence

$$M = A^T \circ B = \begin{pmatrix} .3 & .2 \\ .5 & .2 \\ .4 & .2 \\ .5 & .2 \end{pmatrix}$$

perfectly learns the association since $A \circ M = B$. Suppose we present
the partial pattern $A' = (.3\ 0\ .4\ 0)$. Then
$A' \circ M = (.4\ .2) \subset B$. The bidirectional FAM will be a good but
suboptimal memory since the key condition is not satisfied: no element in
$B$ is at least as large as $a_2 = 1$. Hence
$B \circ M^T = A'' = (.3\ .5\ .4\ .5) \subset A$.
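
The heteroassociative example can be replayed in code. This is a sketch
under the stated definitions (min encoding, max-min recall); the function
names, including `dominance_ok` for the theorem's condition, are
hypothetical.

```python
def compose(a, m):
    """Max-min composition A o M."""
    return [max(min(a[i], m[i][j]) for i in range(len(a)))
            for j in range(len(m[0]))]

def fuzzy_hebb(a, b):
    """m_ij = min(a_i, b_j)."""
    return [[min(ai, bj) for bj in b] for ai in a]

def dominance_ok(key, target):
    """True if every target element is matched or exceeded by some key element."""
    return all(any(t <= k for k in key) for t in target)

a = [0.3, 1.0, 0.4, 0.7]
b = [0.5, 0.2]
m = fuzzy_hebb(a, b)
mt = [list(col) for col in zip(*m)]

forward = compose(a, m)               # A o M, recall of B from A
backward = compose(b, mt)             # B o M^T, recall of A from B
forward_condition = dominance_ok(a, b)
backward_condition = dominance_ok(b, a)
```

Here `forward_condition` holds and forward recall is exact, while
`backward_condition` fails because no element of B reaches 1, so the backward
result is only a subset of A.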

The FAM theorems can be viewed as new theorems in the new field of
fuzzy eigensets. Our construction technique shows how to find the fuzzy
relation $M$ that has a given fuzzy set $A$ as an eigenset in the sense of
max-min composition: $A \circ M = A$. It turns out, however, that the m-many
FAMs $M_i$ storing the pairs $(A_i, B_i)$ cannot simply be added or maxed
together to store the pairs. The eigenset property is too pervasive. For
instance, suppose $M$ is formed by a pointwise maximum operation:
$M = \max(M_1, \ldots, M_m)$. Then

$$M = (\max(A_1, \ldots, A_m))^T \circ \max(B_1, \ldots, B_m) ,$$

in other words, the eigenset pair of $M$ is $(\max A_i, \max B_i)$. Hence
all patterns $(A, B)$ get mapped into a subset of the max pair. If pairwise
max is replaced with normalized addition, then all pairs get mapped into
subsets of the normalized sum pair, and so forth. So although these FAMs
permit parallel distributed storage of patterns, these storage media cannot
be naively superimposed. For many applications, including the AT&T Bell Labs
fuzzy logic chip, such superimposition is not necessary.


3. Differential Hebbian Subsumption of Hebbian Learning. We report the
first  derivation  of  Hebbian  learning  phenomena  that  we  know  of  in  the 
literature.  Bart  derived  this  simple  but  powerful  result  in  his  work  on 
kinetic-energy  Lyapunov  functions  for  neural  networks. 

Hebb postulated that synaptic change is driven by the correlation of
pre-synaptic and post-synaptic activity. If we denote the nonnegative
activation of neuron i by $X_i(t)$ at time t and the directed edge or
synapse from $X_i$ to $X_j$ by the real-valued function $e_{ij}(t)$ at time
t, then the Hebbian learning law takes the form

$$\dot{e}_{ij} = -e_{ij} + X_i X_j \qquad (1)$$

where the "forget function" $-e_{ij}$ has been appended to represent the
passive decay of neurotransmitter release when $X_i$ and $X_j$ are inactive.
Equation (1) takes many forms in the literature (other terms are added,
functions of $X_i$ and $X_j$ are correlated, etc.), but the recurring
structure is that concurrent (or lagged) activation drives learning.
Unfortunately the activation product in (1) grows synaptic connections
between neurons at an exponential clip. If a forget term is not added, or if
it has, as experimentally it seems to have, a small weight, then $e_{ij}$
rapidly saturates at its maximum positive strength. Total connectivity
results and no learning occurs. This is, we add, an abundant simulation
phenomenon among neural net researchers.


The differential Hebbian hypothesis is that concurrent (or lagged)
change, or concomitant variation, drives learning:

$$\dot{e}_{ij} = -e_{ij} + \dot{X}_i \dot{X}_j \qquad (2)$$

We stress, however, that the forget term need not occur in (2). It is added
for sake of comparison. For instance, in causal reasoning, causal
connections do not passively decay away! So for sake of contrast let us
replace (1) and (2) with (3) and (4):

$$\dot{e}_{ij} = X_i X_j , \qquad (3)$$

$$\dot{e}_{ij} = \dot{X}_i \dot{X}_j . \qquad (4)$$

Hence in (4) learning is governed by a natural correlation sign law: the
edge strengthens if the activation changes agree in sign and weakens or
tends to be inhibitory if they disagree in sign.


Which learning law is more accurate? Through personal communication,
we have found that many researchers pursuing this question answer it with a
natural, yet diplomatic, compromise:

$$\dot{e}_{ij} = X_i X_j + \dot{X}_i \dot{X}_j \qquad (5)$$


This  model  seems  natural  enough — just  add  first,  and  perhaps  second,  order 
variables.  Indeed  it  is  natural,  but  we  can  now  prove  it  is  an  unavoidable 
restatement  of  (4). 


Our argument focuses on the relatively noncontroversial model of
short-term memory or activation:

$$\dot{X}_i = -a X_i + \sum_j S(X_j)\, e_{ji} + \cdots = -a X_i + O_i , \qquad (6)$$

where $a > 0$ is the short-term memory decay constant (function), $S$ is
some nondecreasing, usually sigmoid, signal function, and $O_i$ represents
other terms. Then upon substitution of (6) into (4), we obtain the
unambiguous prediction that some Hebbian learning behavior occurs:

$$\dot{e}_{ij} = \dot{X}_i \dot{X}_j = a^2 X_i X_j + O_{ij} , \qquad (7)$$

where $O_{ij}$ denotes other terms in the learning equation (that may or may
not be positive). Upon rearrangement and rescaling as necessary, we see that
(7) is in fact equivalent to (5).
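
The substitution step can be written out term by term. The following is a
sketch that assumes only the compact form of (6), with each activation
velocity equal to a decay term plus an unexpanded remainder $O_i$; the
grouping of the cross terms into $O_{ij}$ is one natural reading, not a
formula taken from the report.

```latex
\begin{aligned}
\dot{e}_{ij} &= \dot{X}_i \,\dot{X}_j \\
             &= (-a X_i + O_i)\,(-a X_j + O_j) \\
             &= a^2 X_i X_j \;-\; a\,(X_i O_j + X_j O_i) \;+\; O_i O_j \\
             &= a^2 X_i X_j + O_{ij},
\qquad O_{ij} \equiv -a\,(X_i O_j + X_j O_i) + O_i O_j .
\end{aligned}
```

The leading term is a pure Hebbian product scaled by $a^2$, so it vanishes as
the decay constant $a$ goes to zero.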


Equation (7) summarizes a new synaptic theory. It predicts where and
to what extent Hebbian learning occurs and quantitatively suggests why a
strict Hebb law conflicts with neurophysiological data. For example, if a
regression analysis is performed on synaptic behavior, we expect the
explanatory contribution of the Hebb component to be negligible when the
behavior does not involve (much) passive decay. Since activation decay is
fundamental to both neural and causal processes, the Hebbian prediction of
(7) is quite robust. It even suggests where to search for an
electrochemical mechanism for Hebbian behavior, namely in the interaction of
two resting-potential media along a conducting medium.


4.  A  New  Fuzzy  Integral  and  Expectation  Operator.  How  can  a  function  f  be 
integrated  over  a  fuzzy  set  A?  Traditional  fuzzy  answers  to  this  question 
have  produced  noncomputable  sup-min  structures  where  the  supremum  is  taken 
over  an  uncountably  infinite  interval,  usually  [0,  1].  We  find  this 
approach unfruitful for a variety of reasons. Bart has developed an
alternative  theory  of  integration. 

We propose an abstract fuzzy integral defined in terms of the positive
measure Sigma-Count. The sigma count of a fuzzy set is simply the sum of
the fit values. For instance, Sigma-Count(.2 .3 1 .8) = 2.3; hence fuzzy
cardinality can be a real number, not just an integer. Though it is beyond
the scope of this R&D Status Report, we define the fuzzy integral of $f$
over $A$ to be a sum of products: the product of $f_i$ with the Sigma-Count
of $A$ intersected with all the points $x$ such that $f(x) = f_i$.

We define the fuzzy expectation of $f$ with respect to $A$ as simply the
sum of the products $f(x)\, m_A(x)$, where $m_A(x)$ is the degree of
membership, or fit value, of $x$ in the subset $A$. The point is that,
unlike the probabilistic expectation, we do not require that
Sigma-Count(A) = 1.

Our fundamental theorem is that this Sigma-Count fuzzy integral equals
this intuitive fuzzy expectation operator! Besides the many theoretical
problems this solves, it makes a fuzzy integral easy to compute, often by
hand. For instance, suppose the domain $X = \{1, 2, 3, 4\}$ and $A$ is given
as before by $A = (.2\ .3\ 1\ .8)$. Then if $f$ is the squaring function,
$f(x) = x^2$, then
$E_A(f) = .2 \times 1 + .3 \times 4 + 1 \times 9 + .8 \times 16 = 23.2$.
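
A short Python sketch makes the two definitions concrete for the squaring
function on X = {1, 2, 3, 4}, where the terms are .2 x 1, .3 x 4, 1 x 9, and
.8 x 16. The function names are hypothetical, and the level-set form of the
integral is written from the verbal definition given above, so it should be
read as one interpretation of that definition.

```python
from collections import defaultdict

def sigma_count(fits):
    """Fuzzy cardinality: the sum of the fit values."""
    return sum(fits)

def fuzzy_expectation(f, domain, fits):
    """Sum of f(x) * m_A(x) over the domain."""
    return sum(f(x) * m for x, m in zip(domain, fits))

def sigma_count_integral(f, domain, fits):
    """Sum over distinct values f_i of f_i * Sigma-Count(A restricted to f(x) = f_i)."""
    level_counts = defaultdict(float)
    for x, m in zip(domain, fits):
        level_counts[f(x)] += m
    return sum(value * count for value, count in level_counts.items())

domain = [1, 2, 3, 4]
fits = [0.2, 0.3, 1.0, 0.8]

def square(x):
    return x * x

def parity(x):          # a non-injective f, so level sets hold several points
    return x % 2

cardinality = sigma_count(fits)
expectation = fuzzy_expectation(square, domain, fits)
integral = sigma_count_integral(square, domain, fits)
parity_expectation = fuzzy_expectation(parity, domain, fits)
parity_integral = sigma_count_integral(parity, domain, fits)
```

Using a non-injective function such as `parity` exercises the level-set
grouping, which is where the two formulas could in principle differ.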

This  theory  is  presented  in  the  paper,  "Fuzzy  Expectations,"  also  in 
preparation. $E_A$, and hence the fuzzy integral equivalent to it, are
obviously trivial to optically implement.


5. Fuzzy Knowledge Combination Theory for Arbitrarily Many Fuzzy Cognitive
Maps: Hidden Patterns. Bart developed a new theory for combining arbitrarily
many fuzzy cognitive maps (FCMs) obtained from arbitrarily many experts of
arbitrary credibility. These results are in the paper "Adaptive Cognitive
Processing," also in preparation.

We limit the discussion to simple FCMs. These are fuzzy signed
digraphs. An edge $e_{ij}$ from concept variable $C_i$ to concept variable
$C_j$ has a weight or degree of causal strength in the fuzzy causal interval
[-1, 1]. $e_{ij} = 0$ indicates no causal connection. $e_{ij} > 0$ indicates
that $C_i$ causally increases $C_j$; the larger $e_{ij}$, the more $C_i$
increases $C_j$. $e_{ij} < 0$ indicates that $C_i$ causally decreases $C_j$:
$C_i$ up implies $C_j$ down, $C_i$ down implies $C_j$ up. An FCM can be
represented by the fuzzy relation or square matrix $F$, where
$f_{ij} = e_{ij}$.

Suppose k-many experts represent their knowledge of some complex
situation in k-many FCMs $F_i$ of different square dimensions. What have the
experts given us? How can we combine their knowledge? What can we do with
it when we have combined it? Nontrivial answers to these questions follow
from a simple examination of fundamentals.


We  observe  that  the  complex  situation  represented  can  contain  a  mix  of 
factual  and  conceptual  variables  deeply  interconnected  through  partial 
causality.  The  factual  variables  might  include  agricultural  exports  or 
enemy  mortality.  The  conceptual  variables  might  include  stress,  utility,  or 
love.  What  sorts  of  inferences  do  people  draw  from  such  entangled  fuzzy 
concepts?  How  do  they  conceptually  compute?  Surely  they  associate  input 
patterns  with  predicted  or  output  patterns.  Although  some  people  can 
articulate  some  of  the  serial  causal  paths  in  their  inferences  about  complex 
phenomena,  most  do  not.  Indeed,  from  an  evolutionary  point  of  view, 
inference  articulation  is  far  less  important  than  inference  accuracy. 


This suggests that we can perform simple yet interesting associative
inferences on FCMs. Indeed we observe that each matrix has the bipolar
form that is so important for threshold-linear dynamics. So let us proceed
as follows. As a first approximation, let each concept node $C_j$ be either
on or off (1 or 0) at any given time. The simplest rule for deciding whether
$C_j$ fires is the threshold-linear rule: if the gated summed inputs to
$C_j$ exceed 0, then $C_j$ turns on; else off. (Incidentally this threshold
law, unlike the BAM and Hopfield threshold laws, approximates the passive
exponential decay of causal activation.) In synchronous operation we
therefore have reduced FCM operations to vector operations:

$$S_{i+1} = F_j(S_i) ,$$

where $S_i$ is the binary FCM state vector at iteration i. Since each FCM
$F_j$ is nonsymmetric, we expect to observe rich dynamical behavior in terms
of stable limit cycles. In this setting, however, limit cycles (generalized
fixed points) are quite welcome. They are temporal predictions, forecasts
of sequences of events. Since all the weights in each $F_j$ are in [-1, 1],
we do not expect that any given $F_j$ will possess many stable limit cycles
relative to the number of nodes. In other words, the perceived regularity
of responses of the experts to what-if questions (state vectors) corresponds
to mapping the $2^n$ possible questions to no more than n answers.
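
The synchronous threshold iteration and the search for a limit cycle can be
sketched as follows. The three-node causal loop, its weights, and the
function names are invented for illustration; the update rule is the
"exceeds 0 turns on, else off" rule described above.

```python
def fcm_step(state, f):
    """One synchronous update: node j turns on iff its summed input exceeds 0."""
    n = len(state)
    return tuple(1 if sum(state[i] * f[i][j] for i in range(n)) > 0 else 0
                 for j in range(n))

def find_limit_cycle(state, f):
    """Iterate until a state repeats; return the repeating segment of states."""
    seen = {}
    path = []
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = fcm_step(state, f)
    return path[seen[state]:]

# Hypothetical FCM: C1 increases C2, C2 increases C3, C3 increases C1.
f = [[0.0, 0.8, 0.0],
     [0.0, 0.0, 0.5],
     [0.6, 0.0, 0.0]]

cycle = find_limit_cycle((1, 0, 0), f)
fixed = find_limit_cycle((0, 0, 0), f)
```

Because there are only finitely many binary states, the loop in
`find_limit_cycle` always terminates; a returned segment of length one is a
fixed point.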

The next problem is how to deal with the different nodes, or causal
concepts, the experts discuss. The ith expert includes $n_i$ nodes in his
FCM $F_i$. In general the node sets $N_i$ and $N_j$ overlap, i.e., the
experts tend to discuss many of the same concepts in their causal
explanations. We view this in a simple way. We assume every expert discusses
every node (all the nodes in the union of $N_1, \ldots, N_k$). However, many
of these nodes are effectively undiscussed because the expert believes they
are not causally connected to any other nodes.

In summary, we augment the FCM matrices $F_i$ to include all the nodes
discussed by all the experts, nodes $C_1, \ldots, C_n$. The rows and columns
of each augmented connection matrix $M_i$ are suitably permuted to bring
them into mutual coincidence.

The knowledge combination procedure is now clear. The simplest way to
superimpose the fuzzy bipolar matrix memories $M_i$ is to add them together
pointwise:

$$M = \sum_{i=1}^{k} M_i = M_1 + \cdots + M_k .$$

This combination technique amounts to an intuitive voting scheme. If 50
experts say $C_i$ causes (+1) $C_j$ and 50 say $C_i$ causally decreases (-1)
$C_j$, then the synthesized $e_{ij} = m_{ij} = 0$. In general $m_{ij}$
reflects the preponderance of excitatory over inhibitory connections, or
inhibitory over excitatory connections.

We then conjecture that $M$ embodies certain hidden patterns. A hidden
pattern is a resonant or equilibrium state of the activated FCM $M$:
$P = M(P)$, where $P$ is a finite limit cycle. An intuitive interpretation
of a hidden pattern is that it is the consensus eventually reached by a
round-table discussion among experts. A topic or situation (state vector) is
proposed, then fairly soon a rough agreement is reached. The point is that
the final agreement may differ from the complete position of each expert.
It emerges from associative group interaction. Nor need the final consensus
be a unanimous opinion (fixed point). It can be an agreed-upon sequence or
set of conditions, or a clear-cut disagreement, all of which intuitively
correspond to a stable limit cycle. The task is to decode the hidden
patterns in $M$'s edges.

The basic quantitative relationship that governs the dynamical shape of
the hidden patterns of $M$ is the inverse relationship between symmetry of
$M$ and occurrence of limit cycles. The more symmetric $M$ (the closer $M$
to $M^T$), the fewer and the shorter the limit cycles among the hidden
patterns. Symmetry up, limit cycles down. For instance, if $M = M^T$, a
simple unidirectional Lyapunov argument (discrete version of the
Cohen-Grossberg Theorem) shows that all hidden patterns are fixed vectors.
The less symmetric $M$ is, the more complicated the feedback loops in $M$,
and thus the greater chance of oscillation.

We now derive a rough estimate of the frequency and length of limit
cycles "hidden" in $M$. Fix the total number of nodes discussed by the
k-many experts at n but let k vary. The fewer experts there are discussing
the same concepts, the sparser each augmented FCM tends to be, and hence the
sparser $M$ tends to be. But the sparser $M$ is, the more $M$ approximates a
symmetric matrix, since the more often $m_{ij} = m_{ji} = 0$ tends to occur.
Similarly, the more experts there are relative to the number of concepts
discussed, the less sparse $M$ tends to be and thus the less symmetric $M$
tends to be. A similar conclusion follows when k is fixed and n is varied.
Hence the dynamics of $M$ are driven by its symmetry; in turn the symmetry
of $M$ is driven by the ratio k/n. Therefore if $L$ denotes the expected
frequency/length of limit cycles, $L$ can be approximated by

$$L \sim k/n .$$

Suppose now each expert i has a credibility weight $w_i$ in [0, 1]. How
do we form the weighted augmented FCM matrices $M_i^w$? Since $M$ is formed
by summing the $M_i$ matrices, the natural operation for "gating" the
knowledge of i by $w_i$ is simply to multiply the connections in $M_i$ by
$w_i$:

$$M_i^w = w_i M_i , \qquad M^w = \sum_i w_i M_i .$$

Hence if i is highly credible ($w_i$ is near 1), $M_i$ will be relatively
well represented in $M$. If i is incredible ($w_i$ is near 0), $M_i$ will
make little contribution to $M$.
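
The augmentation, voting, and credibility-weighting steps can be sketched
together. In this illustration each expert's FCM is a dictionary of weighted
edges between named concepts; the representation, the function names, and
the two toy experts are assumptions made for the example.

```python
def augment(edges, nodes):
    """Embed one expert's edges in a matrix over the shared, ordered node list."""
    index = {name: k for k, name in enumerate(nodes)}
    m = [[0.0] * len(nodes) for _ in nodes]
    for (src, dst), weight in edges.items():
        m[index[src]][index[dst]] = weight
    return m

def combine(fcms, weights):
    """Return (nodes, M^w), where M^w is the sum over experts of w_i * M_i."""
    nodes = sorted({name for edges in fcms for pair in edges for name in pair})
    size = len(nodes)
    total = [[0.0] * size for _ in range(size)]
    for edges, w in zip(fcms, weights):
        mi = augment(edges, nodes)
        total = [[total[r][c] + w * mi[r][c] for c in range(size)]
                 for r in range(size)]
    return nodes, total

# Two hypothetical experts who disagree about the effect of rain on crops.
experts = [
    {("rain", "crops"): 1.0},
    {("rain", "crops"): -1.0, ("crops", "exports"): 1.0},
]

nodes, m_equal = combine(experts, [1.0, 1.0])      # equal credibility
_, m_weighted = combine(experts, [1.0, 0.5])       # second expert half as credible
rain, crops, exports = nodes.index("rain"), nodes.index("crops"), nodes.index("exports")
```

With equal weights the opposing votes on the rain-to-crops edge cancel, while
halving the second expert's credibility leaves a net positive edge.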

How do the weights $w = (w_1, \ldots, w_k)$ affect $L_w$, the limit cycle
behavior of $M^w$? Note that if all $w_i = 1$, then
$k = w_1 + \cdots + w_k = W$, where $W$ denotes the sum of weights.
Otherwise, $k > W$. On average, the smaller $W$ is relative to k, the
smaller each $w_i$ tends to be; hence, the smaller the edges in $M$ tend to
be; hence, the closer $M^T$ is to $M$; hence, the more symmetric $M$ tends
to be. So, generalizing the above estimate for $L$, we can approximate $L_w$
by

$$L_w \sim W/n .$$

6. BAM Simulation and Demonstration. Robert Sasseen successfully
demonstrated both unidirectional and bidirectional bivalent associative
memories on VERAC's Texas Instruments Explorer AI Workstation. This
graphics-intensive software is written in the object-oriented programming
language FLAVORS, which seems especially appropriate, as well as commodious,
for representing network behavior. We mention that Robert has developed
several other Explorer simulations of much more complex network behavior.
Currently the utility of using FLAVORS to model networks on the Explorer is
heavily constrained by the Explorer's arithmetic processing capabilities. To
relieve these constraints, VERAC's Adaptive Systems Group has entered an
agreement with TI to beta-test their new Odyssey Board (containing four
TMS-32020 DSP processors) on the Explorer. We have agreed to receive the
Odyssey Board in November 1986. We expect that this neural network
accelerator will greatly increase our ability to test new network theories
and hypotheses.

At the American Association for Artificial Intelligence (AAAI)
Conference in Philadelphia in early August 1986, Robert successfully
demonstrated some of these simulations at the TI exhibit booth. The
responses were quite positive.

While  at  the  AAAI  conference  in  Philadelphia,  Bart  and  Robert  were 
given  a  tour  of  Nabil  Farhat's  optics  lab  at  the  University  of  Pennsylvania. 
The  tour  was  thorough  and  courteous,  and  we  have  invited  Nabil  Farhat  to 
tour  the  VERAC-UCSD  OCCAM  facilities. 


III.  OCCAM  AT  UCSD 


This section summarizes the results of the 1st Quarter OCCAM research
effort at UCSD's Optics Lab. The principal points are six: (1) recruitment
of project personnel, (2) design and simulation of a new form of all-optical
dynamical associative memory, (3) design and simulation of a translation,
rotation, and scale invariant optical preprocessor suitable for pattern
recognition by associative memory, (4) design of systems for performing
optical minimum and maximum operations that are fundamental to
implementations of fuzzy logic/sets, (5) design and construction of an
associative memory demonstration computer board, and (6) presentation of a
tutorial titled "Holographic Approach to Associative Memory."


1. Recruitment of OCCAM personnel. Assistant Professor Clark Guest directs
the OCCAM effort at UCSD. When OCCAM commenced, Clark had recruited three
Ph.D. candidate graduate students in the Department of Electrical
Engineering and Computer Science (EECS) to participate in the project.
Those students are Myung Soo Kim, Robert Te Kolste, and Hedong Yang. Clark
personally grounded all three in associative memory and neural net
methodology through independent study projects; they also successfully
completed Bart's Fuzzy Theory course.


2. New Optical Dynamical Neural Network. A new optical implementation of a
crossbar associative network with feedback, the type studied by Kohonen,
Amari, Grossberg, et al. and made popular by Hopfield, has been designed and
simulated. The design uses the sigmoidal response of the Hughes Liquid
Crystal Light Valve (LCLV) device to implement the threshold neuron
processing elements.


The LCLV modulates light through a birefringent polarization conversion
that nominally yields a sine squared output intensity response to an
increasing input intensity. Proper electrical biasing ensures that the
output saturates at the first peak of the sine squared curve, thereby
yielding the approximately sigmoidal response necessary for
noise-quenching/signal-enhancing processing element behavior.

The crossbar interconnection of nodes is achieved with an optical
matrix-vector multiplier designed around the LCLV, as shown in Figure 1
attached. The matrix of connection strengths is imaged onto the output side
of the LCLV with linearly polarized light. A uniform beam with the
orthogonal polarization is also shone onto the same side of the LCLV.
Bipolar connection weights are achieved through a comparison of the two
beams.

Where the matrix image intensity exceeds the uniform beam intensity, a
positive weight is coded; otherwise, a negative weight is coded.
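
The comparison scheme can be sketched numerically in Python. The sketch assumes, purely for illustration, that the effective weight is the linear difference between the matrix image intensity and the uniform beam intensity; the function names and the reference intensity of 1.0 are hypothetical, and the polarization optics themselves are not modeled.

```python
def encode_bipolar(weights, reference):
    """Map signed weights to non-negative intensities: I = reference + w."""
    encoded = [[reference + w for w in row] for row in weights]
    if any(i < 0.0 for row in encoded for i in row):
        raise ValueError("reference intensity too small for these weights")
    return encoded


def decode_bipolar(intensities, reference):
    """Recover signed weights by comparing against the uniform reference beam."""
    return [[i - reference for i in row] for row in intensities]


# Hypothetical 2 x 2 connection matrix with weights in [-1, 1].
weights = [[0.5, -0.25], [-1.0, 0.0]]
image = encode_bipolar(weights, reference=1.0)     # what the display would show
recovered = decode_bipolar(image, reference=1.0)   # sign set by the comparison
```

Intensities above the reference decode to positive weights and intensities below it decode to negative weights, so one non-negative image carries both signs.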

The current system has several advantages over other electro-optical
implementations of feedback associative memories. Connection matrix weights
are entered as light intensity; an image on the face of a CRT will suffice.
Thus connection strengths can be readily changed according to any desired
adaptation algorithm. Bipolar connection weights are achieved without
requiring separate display elements for the positive and negative values.

The LCLV intrinsically provides the nonlinear response of the neurons,
thereby eliminating the need for optical-to-electronic and
electronic-to-optical conversions on every iteration. The LCLV is a high
resolution device, and should eventually be able to support 500 or more
neuron elements.

A computer simulation of the system was coded in the programming
language C. The simulation incorporated not only the general feedback
associative memory architecture, but the sine squared characteristic of the
LCLV, the dynamical time response of the LCLV, and the polarization encoding
of bipolar weights as well. Simulation results show that the LCLV feedback
associative memory dependably converges to correct recall within a few
response times of the LCLV. Since in the 2nd Quarter of the OCCAM program
convergence was proved for the continuous version of the BAM, an LCLV
implementation seems promising.
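
A minimal Python sketch of this kind of simulation is given below. It is not the C program described above. It assumes a sine squared response clipped at its first peak, a first-order lag standing in for the LCLV time response, and outer-product (Hebbian) connection weights; all names, the gain, the time step, and the two stored patterns are hypothetical.

```python
import math


def lclv_response(intensity):
    """Assumed transfer curve: sin^2 rising to its first peak, then held there."""
    clipped = min(max(intensity, 0.0), 1.0)
    return math.sin(0.5 * math.pi * clipped) ** 2


def hebbian_weights(patterns):
    """Outer-product weights for bipolar patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(weights, probe, gain=0.25, dt=0.5, steps=20):
    """Iterate the feedback loop and return the thresholded bipolar state."""
    y = [(s + 1) / 2 for s in probe]        # bipolar probe -> intensities in [0, 1]
    for _ in range(steps):
        signal = [2 * v - 1 for v in y]     # intensities -> bipolar signals
        drive = [sum(wij * sj for wij, sj in zip(row, signal)) for row in weights]
        target = [lclv_response(0.5 + gain * u) for u in drive]
        y = [v + dt * (t - v) for v, t in zip(y, target)]  # first-order lag
    return [1 if v > 0.5 else -1 for v in y]


# Two orthogonal stored patterns and a probe with one element of P1 flipped.
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = hebbian_weights([P1, P2])
probe = [-1, 1, 1, 1, -1, -1, -1, -1]
recalled = recall(W, probe)
```

In this small example the corrupted element relaxes toward its stored value over a few time steps while the other elements stay saturated, so the thresholded state returns the stored pattern P1.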

Based on these results, Hughes was approached and subsequently
consented to donate an LCLV to the OCCAM project. Characterization
measurements are currently being conducted on the device, and experimental
implementation of the feedback associative memory will begin soon.


The first phase will be to use a traditional optical preprocessing
system to achieve translation, scale, and rotation invariant feature
extraction. The feature values will serve as inputs to an associative
memory that will perform image classification. In the second phase,
invariance operations will be incorporated into the associative part of the
system, simplifying and eventually eliminating the optical preprocessor. A
comparison of the approaches will be made to determine which tasks are most
efficiently carried out through preprocessing and which tasks are
appropriate for associative systems.

Pursuant  to  the  first  phase  objective,  an  optical  preprocessor  capable 
of  translation,  scale,  and  rotation  invariant  feature  extraction  has  been 
designed.  This  preprocessor  is  represented  in  Figure  2  attached. 

Translation invariance is achieved in the first stage by taking the
magnitude (with the LCLV) of the two-dimensional optical Fourier transform
of the object. The processor's second stage is a phase-coded matched filter
that uses circular harmonics as rotationally invariant features. Radial
moments r^k, k = 1, 2, ..., of the circular harmonics are taken. When the
scale of the input object is changed, all moments are scaled in a
predictable way. Use of an on-center off-surround competitive feature
detector will compensate for this feature scaling, thus permitting invariant
recognition.

The use of circular moments for scale and rotation invariant feature
extraction has been simulated with a computer model. The four alphabetic
letters A, E, F, and R were used as input images. Twenty-five circular
moments have been calculated for each letter in a variety of scales and
rotations. The results demonstrate good intraclass recognition and
interclass discrimination. Currently, the method of mutual information is
being used to select a smaller set of moments that will be used in the
planned optical implementation. Simulation of associative memory
classification of detected features is also proceeding.


4. Optical Min and Max Fuzzy Operators. Optical implementation of fuzzy
logic and fuzzy cognitive maps (FCMs) is a key objective of the OCCAM
project. The operations of maximum and minimum play roles in fuzzy logic
computations, as discussed above in the section on FAMs, that are parallel
to the roles of addition and multiplication in matrix algebra. Many fuzzy
operations, e.g., the compositional rule of inference, can be cast as
matrix-vector products with max and min substituting for sum and product.
Optical implementations of matrix-vector multipliers are well known. Recent
work in OCCAM has identified optical implementations of the max and min
operations that can be incorporated into fuzzy logic processors.
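
The substitution of max and min for sum and product can be illustrated with a short Python sketch of a max-min vector-matrix composition. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
def max_min_compose(vector, matrix):
    """Compute b_j = max over i of min(a_i, m_ij)."""
    n_cols = len(matrix[0])
    return [
        max(min(a, row[j]) for a, row in zip(vector, matrix))
        for j in range(n_cols)
    ]


# Hypothetical fuzzy input vector and fuzzy relation matrix (values in [0, 1]).
a = [0.2, 0.8, 0.5]
M = [[0.9, 0.1],
     [0.4, 0.7],
     [0.6, 0.3]]

b = max_min_compose(a, M)
```

The loop structure is the same as for an ordinary vector-matrix product, with min taking the place of each multiplication and max taking the place of the summation.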

There are two basic approaches to optical implementation of min and
max. The first is an indirect approach. An optical implementation of the
Boolean test (A > B) is performed bitwise in parallel on two data pages. The
binary mask obtained from this operation is applied to data page A, and the
complement of the mask is applied to page B. The two resulting images are
then combined, yielding an image with the bitwise maximum of data pages A
and B. If the mask and its complement are interchanged, the bitwise minimum
is formed.
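
The indirect, mask-based approach can be sketched in Python on two small data pages. The function name and the page values are hypothetical; the sketch only mirrors the arithmetic of the masking steps, not the optics.

```python
def masked_max_min(page_a, page_b):
    """Form elementwise max and min of two pages using a binary (A > B) mask."""
    mask = [[1 if a > b else 0 for a, b in zip(ra, rb)]
            for ra, rb in zip(page_a, page_b)]
    # Mask on A plus complement on B keeps the larger element of each pair.
    page_max = [[m * a + (1 - m) * b for m, a, b in zip(rm, ra, rb)]
                for rm, ra, rb in zip(mask, page_a, page_b)]
    # Interchanging mask and complement keeps the smaller element instead.
    page_min = [[(1 - m) * a + m * b for m, a, b in zip(rm, ra, rb)]
                for rm, ra, rb in zip(mask, page_a, page_b)]
    return page_max, page_min


# Hypothetical 2 x 2 data pages with values in [0, 1].
A = [[0.2, 0.9], [0.5, 0.5]]
B = [[0.7, 0.4], [0.1, 0.5]]
page_max, page_min = masked_max_min(A, B)
```

Where the two pages hold equal values the mask is 0 and the value from page B is passed in the max page, which gives the same result as taking it from page A.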

The second approach yields a bitwise max or min data page directly,
without the intermediate masking steps. It is based on the identities

max(a, b) = (a + b + |a - b|) / 2
min(a, b) = (a + b - |a - b|) / 2

which follow by checking the three cases a = b, a < b, and a > b. An
optical implementation of this approach is shown in Figure 3. The system
uses coherent subtraction, and implements the absolute value operation with
an LCLV, which must be operated in the linear range of its response curve.
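
The direct approach can be checked numerically in Python. The sketch assumes the identities in question are the standard absolute-value forms max(a, b) = (a + b + |a - b|)/2 and min(a, b) = (a + b - |a - b|)/2, which is consistent with the use of subtraction and an absolute value operation described above; the function names are hypothetical.

```python
def direct_max(a, b):
    """max(a, b) from a sum, a difference, and an absolute value."""
    return (a + b + abs(a - b)) / 2


def direct_min(a, b):
    """min(a, b) from a sum, a difference, and an absolute value."""
    return (a + b - abs(a - b)) / 2


# The three cases a = b, a < b, and a > b.
cases = [(0.5, 0.5), (0.25, 0.75), (0.75, 0.25)]
results = [(direct_max(a, b), direct_min(a, b)) for a, b in cases]
```

No comparison or mask is computed anywhere, which is what distinguishes this approach from the indirect one.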


5. Associative Memory Demonstration Board. For educational and
demonstration purposes, a neural net demonstrator device has been designed
and built. The device consists of a single board microcontroller and a
custom designed display board. The LED display can indicate pairs of input
and output vectors that have been associated in memory. A trial input
vector may be supplied by the user, and an output vector pattern is
generated. The microcontroller is fully programmable, and many memorization
and recall algorithms can be implemented, including BAM and optimal linear
heteroassociators. The board is currently being programmed, and the
hardware is being tested.


6. SPIE Tutorial. At the invitation of SPIE, Clark presented a half-day
tutorial entitled "Holographic Approach to Associative Memory" on 17 August
1986. The tutorial was attended by ten people and was well received. A copy
of the notes is attached to this R&D Status Report.

Finally, an optical volume holographic associative memory has been
designed,  and  analysis  of  its  characteristics  has  begun.  Other 
implementations  of  associative  memories  in  photorefractive  crystals  are 
under  investigation  by  other  groups,  but  they  make  no  use  of  phase  encoding 
of  the  associated  beams.  This  is  an  important  aspect  of  any  system  that 
will  fully  use  the  storage  capacity  of  volume  phase  holograms,  and  is  the 
current  center  of  our  attention.  Experiments  with  association  of  phase 
encoded beams in photorefractive crystals are planned.

Figure 1) LCLV Feedback Associative Memory

Figure 2) Invariant feature optical preprocessor


